feat: subscribe to multiple audio tracks and add RFC 6464 audio level extension#68
Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughAdds fixed-size multi-subscription support for audio tracks, implements RFC 6464 audio level RTP header extension injection and transformation for publishers, and updates the espressif/esp_peer dependency constraint. Changes
Sequence Diagram(s)mermaid Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
cccb4d7 to
402bab7
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@components/livekit/core/peer.c`:
- Around line 40-44: The code currently injects a hardcoded RTP extmap ID and a
fixed audio level: update the injection logic that uses AUDIO_LEVEL_EXTMAP_LINE
and AUDIO_LEVEL_EXT_ID to first parse the SDP for existing "a=extmap:<id>"
entries and choose an unused numeric ID (or reuse the referred URI) instead of
always using 1 to avoid duplicate extmap IDs; also update the
AUDIO_LEVEL_DEFAULT constant comment and/or its use to clearly mark it as a
placeholder (e.g., "placeholder -30 dBov for testing") and, where RTP audio
level is generated, replace the fixed value with real VAD/level measurement or
call a helper that returns a measured level so active speaker detection reflects
actual speech activity.
- Around line 106-129: The function audio_level_get_encoded_size currently
hardcodes payload type 111 which breaks when Opus is negotiated to a different
dynamic PT; update the transformer to use a negotiated payload type value pulled
from configuration/state instead of 111 — e.g., add a payload_type field to the
transformer context (ctx) or to the peer state and populate it when parsing the
SDP/answer, then replace the literal 111 check in audio_level_get_encoded_size
with that context/state value (use esp_peer_rtp_frame_t and RTP_EXT_BLOCK_SIZE
as before) so the transformer works for any negotiated Opus payload type.
🧹 Nitpick comments (1)
components/livekit/idf_component.yml (1)
15-15: Unbounded version constraint risks pulling in future breaking changes.Every other dependency in this file uses a
~(compatible) constraint, butesp_peernow uses>=1.3.0with no upper bound. A future major version bump (e.g., 2.0) could introduce breaking API changes. Consider using~1.3.0(or>=1.3.0, <2.0.0) to stay consistent and safe.- espressif/esp_peer: ">=1.3.0" + espressif/esp_peer: "~1.3.0"
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
components/livekit/core/engine.ccomponents/livekit/core/peer.ccomponents/livekit/idf_component.yml
🧰 Additional context used
📓 Path-based instructions (1)
components/**
📄 CodeRabbit inference engine (AGENTS.md)
Place LiveKit ESP32 SDK sources and supporting components in the
components/directory
Files:
components/livekit/core/engine.ccomponents/livekit/idf_component.ymlcomponents/livekit/core/peer.c
🧠 Learnings (1)
📚 Learning: 2026-01-31T02:18:39.165Z
Learnt from: CR
Repo: livekit/client-sdk-esp32 PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-31T02:18:39.165Z
Learning: Applies to components/** : Place LiveKit ESP32 SDK sources and supporting components in the `components/` directory
Applied to files:
components/livekit/idf_component.yml
🧬 Code graph analysis (1)
components/livekit/core/engine.c (1)
components/livekit/core/signaling.c (1)
signal_send_update_subscription(420-436)
🔇 Additional comments (7)
components/livekit/core/engine.c (3)
91-98: LGTM!The fixed-size array with a count field is a clean approach for this constrained environment, and
memsetincleanup_previous_connectionproperly resets the state on reconnect.
163-183: LGTM!The deduplication logic and bounded array management look correct. The
strncmpandstrlcpyboth properly usesizeof(livekit_pb_sid_t)as the limit.
710-716: LGTM!Removing the early exit ensures all participants are visited. Once max subscriptions are reached, subsequent
subscribe_trackscalls gracefully return after the limit check with minimal overhead.components/livekit/core/peer.c (4)
51-104: LGTM!The SDP injection logic correctly scopes its search to the audio media section, handles the boundary properly, and manages memory allocation with correct sizes. The fallback to
NULLwhen no direction attribute is found is appropriate.
131-176: LGTM!The RFC 5285 one-byte header extension block is correctly constructed:
0xBEDEprofile, 1-word length, proper element header with ID and length nibbles, audio level data with voice-activity bit, and padding to word boundary. The payload copy arithmetic is correct.
286-310: LGTM!The patched SDP path correctly frees the allocated string after forwarding, and the fallback to the original SDP on injection failure is sound.
486-501: LGTM!The transformer registration is correctly gated on publisher role with audio enabled, and gracefully degrades if the API call fails without blocking peer creation.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@components/livekit/core/peer.c`:
- Around line 139-171: In audio_level_transform, add a defensive NULL check for
frame->encoded_data (enc) before any writes/reads and return a non-zero error
(or propagate frame failure) if it's NULL; ensure you guard the subsequent
memcpy/enc writes and avoid modifying enc[0] etc. If available, log or set an
error code on the frame to aid debugging; keep the check at the top of
audio_level_transform immediately after assigning enc.
🧹 Nitpick comments (3)
components/livekit/core/engine.c (1)
710-716: Verify:subscribe_trackserrors are silently ignored here.The return value of
subscribe_tracks()is now discarded. This is fine forENGINE_ERR_MAX_SUB(subsequent participants' tracks just log a warning), butENGINE_ERR_INVALID_ARGwould also be silently swallowed. In practicetrackscomes from protobuf data that could havetracks_count > 0with a valid pointer, so this is low risk, but worth being aware of.components/livekit/core/peer.c (2)
51-104: SDP attribute search doesn't enforce line-boundary matching.
strstr(m_audio, "a=sendrecv")could match a substring within another attribute value (e.g., inside ana=fmtp:line). In practice SDP attributes are well-formed so this is very unlikely, but using a line-boundary-aware search (e.g., searching for"\r\na=sendrecv"or"\na=sendrecv") would be more robust.Also, if the audio section uses
a=inactive, the function returnsNULLand falls back to the original SDP, which is reasonable.
290-310: Full SDP logged at INFO level may be noisy in production.Lines 297 and 302 log the entire SDP string with
ESP_LOGI. On embedded targets with limited serial bandwidth, consider usingESP_LOGD(debug level) to match the pattern used elsewhere in this file for verbose output.♻️ Suggested change
- ESP_LOGI(TAG(peer), "Generated offer (with audio-level extmap):\n%s", patched_sdp); + ESP_LOGD(TAG(peer), "Generated offer (with audio-level extmap):\n%s", patched_sdp);- ESP_LOGI(TAG(peer), "Generated offer:\n%s", sdp); + ESP_LOGD(TAG(peer), "Generated offer:\n%s", sdp);
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
components/livekit/core/engine.ccomponents/livekit/core/peer.ccomponents/livekit/idf_component.yml
🚧 Files skipped from review as they are similar to previous changes (1)
- components/livekit/idf_component.yml
🧰 Additional context used
📓 Path-based instructions (1)
components/**
📄 CodeRabbit inference engine (AGENTS.md)
Place LiveKit ESP32 SDK sources and supporting components in the
components/directory
Files:
components/livekit/core/peer.ccomponents/livekit/core/engine.c
🧬 Code graph analysis (1)
components/livekit/core/engine.c (1)
components/livekit/core/signaling.c (1)
signal_send_update_subscription(420-436)
🔇 Additional comments (3)
components/livekit/core/engine.c (2)
91-98: LGTM — clean fixed-size multi-subscription state.The
MAX_AUDIO_SUBSCRIPTIONSlimit with a parallel count field is a straightforward approach for this embedded context.memset(&eng->session, 0, ...)incleanup_previous_connection(Line 766) correctly resets both the array and the count on reconnect.
153-186: LGTM — deduplication and subscription logic is correct.The dedup check before the capacity check is properly ordered (avoids counting an already-subscribed track against the limit).
strlcpysafely handles SIDs that might exceed the buffer. One minor note: whenENGINE_ERR_MAX_SUBis returned at Line 177, any remaining tracks in the list (even if they'd be deduplicated/skipped) are not iterated, but since only audio tracks are relevant and the limit is legitimately reached, this is acceptable behavior.components/livekit/core/peer.c (1)
486-501: LGTM — transformer registration is properly guarded.Only registered for publisher role with a non-NONE audio codec. Failure is logged as a warning without blocking peer creation, which is the right trade-off for an optional feature.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
|
Hi @martinadamsdev, thanks for your contribution! Since this requires a minor version bump of esp_peer, I’ve opened a separate PR for that. Once that is merged, we can move forward with this PR. |
7809e0c to
df19489
Compare
… extension When multiple devices join the same LiveKit room, they cannot hear each other because subscribe_tracks() only subscribes to the first audio track and breaks. This removes the single-track limitation, allowing subscription to up to MAX_AUDIO_SUBSCRIPTIONS (4) audio tracks with deduplication. Additionally, injects RFC 6464 ssrc-audio-level RTP header extension into publisher SDP and outgoing RTP packets to support LiveKit Active Speaker detection. Requires esp_peer >= 1.3.0 for the RTP transformer API.
df19489 to
ecb1557
Compare
|
Hi @ladvoc, thanks for splitting out the esp_peer upgrade into #70! Now that #70 is merged, I've rebased this PR onto the latest Ready for your review when you get a chance! |
Summary
subscribe_tracks(). Devices in the same room can now hear each other (previously only the first audio track was subscribed, causing device-to-device audio to fail while web clients worked fine).ssrc-audio-levelextmap into publisher SDP and RTP packets via esp_peer's RTP transformer API, enabling LiveKit Active Speaker detection.~1.2.7→>=1.3.0(required foresp_peer_set_rtp_transformerAPI).Changes
engine.csession_state_tnow tracks up to 4 subscribed audio track SIDs with deduplication;subscribe_tracks()no longer breaks after first track;handle_join()no longer exits loop onENGINE_ERR_MAX_SUBpeer.csdp_inject_audio_level_extmap()for SDP patching; new RTP transformer callbacks for injecting RFC 5285/6464 audio level header extension; publisher offer SDP is patched before sendingidf_component.ymlesp_peerversion bumped to>=1.3.0Test plan
Summary by CodeRabbit
New Features
Chores